[MLOB-1555] add LLMObs writers #4699

sabrenner · 2024-09-18T18:27:38Z

What does this PR do?

Adds LLM Observability writers for span events (agentless and agent proxy) as well as evaluation metrics (which write directly to our public API).

Important Notes

These writers will run on intervals separate from the main agent exporter, and in a future PR will be initialized in the appropriate spots to start those intervals (as defined in the constructor of the base writer). Because of this, these writers specifically won't interact with any tracer internal exporters, writers, or encoders.
We need to make sure special Unicode characters are encoded in their \uXXXX escape form in payload strings. I wasn't sure if there was a cleaner way to do this, so any input on this is appreciated!
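
The interval-based flushing described in the first note can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code added in this PR: the class name IntervalWriter, its options (interval, limit, send), and the drop-when-full policy are all hypothetical.

```javascript
'use strict'

// Hypothetical names throughout; this is not the base writer from the PR.
class IntervalWriter {
  constructor ({ interval = 1000, limit = 1000, send }) {
    this._buffer = []
    this._limit = limit
    this._send = send // assumed transport callback: (payload: string) => void

    // The writer owns its timer, so it flushes independently of any exporter.
    this._timer = setInterval(() => this.flush(), interval)
    this._timer.unref() // do not keep the process alive just for flushing
  }

  // Buffer an event; returns false if the buffer is full and the event is dropped.
  append (event) {
    if (this._buffer.length >= this._limit) return false
    this._buffer.push(event)
    return true
  }

  // Send everything buffered so far as one JSON payload.
  flush () {
    if (this._buffer.length === 0) return
    const events = this._buffer
    this._buffer = []
    this._send(JSON.stringify(events))
  }

  // Stop the timer and flush whatever is left.
  destroy () {
    clearInterval(this._timer)
    this.flush()
  }
}

// Usage: collect payloads in memory instead of sending them anywhere.
const sent = []
const writer = new IntervalWriter({ interval: 50, send: payload => sent.push(payload) })
writer.append({ name: 'span-1' })
writer.append({ name: 'span-2' })
writer.destroy()
```

Starting the timer in the constructor mirrors the note above; calling unref() on it is a design choice of this sketch so that a pending flush alone never prevents the process from exiting.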

Motivation

Merges an incremental change, the LLMObs writers, into the LLM Observability SDK release branch.

The timeline of changes to merge looks like (in order):

datadog-datadog-prod-us1 · 2024-09-18T18:27:45Z

.github/workflows/llmobs.yml

+      - run: yarn test:llmobs:ci
+      - if: always()
+        uses: ./.github/actions/testagent/logs
+      - uses: codecov/codecov-action@v3


🟠 Code Vulnerability

Workflow depends on a GitHub action pinned by tag

Pin third party actions by hash, or at least by tag for trusted sources

When using a third-party action, you need to provide its GitHub path (owner/project) and can optionally pin it to a git ref (a branch name, a git tag, or a commit hash).

Without a pinned git ref, the action uses the latest commit of the default branch each time it runs, potentially running newer versions of the code that were not audited by Datadog. Specifying a git tag is better, but since tags are not immutable, using a full-length hash is recommended to make sure the action content is actually frozen to some reviewed state.

Be careful, however: even pinning an action by hash can still be circumvented by attackers. For instance, if an action relies on a Docker image that is itself not pinned to a digest, its behaviour can be altered through the Docker image without changing the action's hash. You can learn more about this kind of attack in Unpinnable Actions: How Malicious Code Can Sneak into Your GitHub Actions Workflows. Pinning actions by hash is still a good first line of defense against supply chain attacks.

Additionally, pinning by hash or tag means the action won't pick up newer versions, including any security patches. Make sure to regularly check whether newer versions of the actions you use are available. For actions from a very trustworthy source, it can make sense to use a laxer pinning policy to benefit from updates as soon as possible.

datadog-datadog-prod-us1 · 2024-09-18T18:27:45Z

.github/workflows/llmobs.yml

@@ -0,0 +1,30 @@
+name: LLMObs


🟠 Code Vulnerability

No explicit permissions set at the workflow level

Check the permissions granted to jobs

Datadog’s GitHub organization defines default permissions for the GITHUB_TOKEN to be restricted (contents:read, metadata:read and packages:read).

Your repository may require a different setup, but please consider defining permissions for each job following the least privilege principle to restrict the impact of a possible compromise.

You can find the list of all possible permissions in Workflow syntax for GitHub Actions - GitHub Docs. Please note they can be defined at the job or the workflow level.

datadog-datadog-prod-us1 · 2024-09-18T18:27:45Z

.github/workflows/llmobs.yml

+  ubuntu:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4


🟠 Code Vulnerability

Workflow depends on a GitHub action pinned by tag

Pin third party actions by hash, or at least by tag for trusted sources

When using a third-party action, you need to provide its GitHub path (owner/project) and can optionally pin it to a git ref (a branch name, a git tag, or a commit hash).

Without a pinned git ref, the action uses the latest commit of the default branch each time it runs, potentially running newer versions of the code that were not audited by Datadog. Specifying a git tag is better, but since tags are not immutable, using a full-length hash is recommended to make sure the action content is actually frozen to some reviewed state.

Be careful, however: even pinning an action by hash can still be circumvented by attackers. For instance, if an action relies on a Docker image that is itself not pinned to a digest, its behaviour can be altered through the Docker image without changing the action's hash. You can learn more about this kind of attack in Unpinnable Actions: How Malicious Code Can Sneak into Your GitHub Actions Workflows. Pinning actions by hash is still a good first line of defense against supply chain attacks.

Additionally, pinning by hash or tag means the action won't pick up newer versions, including any security patches. Make sure to regularly check whether newer versions of the actions you use are available. For actions from a very trustworthy source, it can make sense to use a laxer pinning policy to benefit from updates as soon as possible.

github-actions · 2024-09-18T18:28:29Z

Overall package size

Self size: 7.17 MB
Deduped: 62.53 MB
No deduping: 62.81 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.1.1 | 18.67 MB | 18.68 MB |
| @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB |
| @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.4.1 | 2.14 MB | 2.23 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| jsonpath-plus | 9.0.0 | 580.4 kB | 1.03 MB |
| import-in-the-middle | 1.8.1 | 71.67 kB | 785.15 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| lru-cache | 7.14.0 | 74.95 kB | 74.95 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe


pr-commenter · 2024-09-18T18:46:53Z

Benchmarks

Benchmark execution time: 2024-09-19 18:46:59

Comparing candidate commit ce7e950 in PR branch sabrenner/llmobs-writers with baseline commit 54c8eec in branch sabrenner/llmobs-sdk-release.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.

Yun-Kim

LGTM from team mlobs, just some small suggestions / clarification questions

packages/dd-trace/src/llmobs/writers/evaluations.js

packages/dd-trace/src/llmobs/writers/spans/agentless.js

Yun-Kim · 2024-09-19T18:15:36Z

packages/dd-trace/src/llmobs/writers/base.js

+      if (typeof value === 'string') {
+        return encodeUnicode(value) // serialize unicode characters
+      }
+      return value


Just for clarification, can you explain what exactly is happening here? Does JSON.stringify() get called first, and then we run the encodeUnicode() helper on the result afterwards?

sabrenner · 2024-09-19

It runs while JSON.stringify is serializing. When you pass a replacer function to JSON.stringify, it calls that function for every value in the object. Since our decoder on ingestion needs Unicode characters escaped (e.g. – → \u2013), the replacer makes sure those special characters are written with the correct Unicode escape. I think json.dumps does this for us in the Python SDK, but JSON.stringify doesn't do it by default. There might be a better approach for this; I'll wait for input from the Node.js folks.
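
To make the replacer behaviour concrete, here is a small sketch comparing two ways of producing \uXXXX escapes. It rests on assumptions: encodeUnicode is named in the diff above, but its body here is a guess, and both wrapper functions (encodeWithReplacer, encodeAfterStringify) are hypothetical. The cleanup pass in the first approach is also not shown in the diff; it is included because JSON.stringify escapes any backslash a replacer returns.

```javascript
'use strict'

// Assumed body for the helper: rewrite every UTF-16 code unit above 0x7f as
// the six-character text \uXXXX. Astral characters become a surrogate pair of
// escapes, which is also how JSON represents them.
function encodeUnicode (str) {
  return str.replace(/[\u0080-\uffff]/g, ch =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'))
}

// Approach 1: escape inside a replacer. JSON.stringify calls the replacer for
// each value while serializing, then escapes the backslash the replacer
// introduced, so the raw output contains \\u2013 and needs a cleanup pass.
function encodeWithReplacer (payload) {
  return JSON.stringify(payload, (key, value) =>
    typeof value === 'string' ? encodeUnicode(value) : value
  ).replace(/\\\\u/g, '\\u')
}

// Approach 2: serialize first, then escape. In JSON.stringify output,
// non-ASCII characters can only appear inside string literals, so replacing
// them with \uXXXX escapes keeps the document valid.
function encodeAfterStringify (payload) {
  return encodeUnicode(JSON.stringify(payload))
}

const payload = { text: 'a – b' }
const viaReplacer = encodeWithReplacer(payload)
const viaPostPass = encodeAfterStringify(payload)
```

Both functions produce the same output for this payload. One difference worth weighing: the cleanup pass in the first approach also rewrites strings that legitimately contain a backslash followed by u (for example a Windows path such as C:\users), while the second approach leaves them intact and additionally covers object keys.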

sabrenner added 2 commits September 18, 2024 10:57

writers

0125bbf

tests

185fbe3

datadog-datadog-prod-us1 bot reviewed Sep 18, 2024


Merge branch 'sabrenner/llmobs-sdk-release' into sabrenner/llmobs-writers

4a85292

sabrenner marked this pull request as ready for review September 19, 2024 14:21

sabrenner requested a review from a team as a code owner September 19, 2024 14:21

Yun-Kim approved these changes Sep 19, 2024


make agentless spans and eval metrics endpoints constants

ce7e950
